{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Labeling: Excess Return Over Median\n",
    "\n",
    "![image_example](img/distribution_over_median.png)\n",
    "_*Fig. 1:*_ Distribution of excess over median return for 22 stock tickers from period between Jan 2019 and May 2020.\n",
    "\n",
    "## Abstract\n",
    "\n",
    "In this notebook, we demonstrate labeling financial returns data according to excess over median. Using cross-sectional data on returns of many different stocks, each observation is labeled according to whether (or how much) its return exceeds the median return. Correlations can then be found between features and the likelihood that a stock will outperform the market.\n",
    "\n",
    "## Introduction\n",
    "This technique is used in the following paper:\n",
    "[\"The benefits of tree-based models for stock selection\"](https://link.springer.com/article/10.1057/jam.2012.17) by _Zhu et al._ (2012). \n",
    "\n",
    "In that paper, independent composite features are constructed as weighted averages of various parameters in fundamental and quantitative analysis, such as PE ratio, corporate cash flows, debt etc. The composite features are applied as parameters in linear regression or a decision tree to predict whether a stock will outperform the market median return.\n",
    "\n",
    "\n",
    "## How it works\n",
    "\n",
    "A dataframe containing forward total stock returns is calculated from close prices. The median return of all companies at time $t$ in the dataframe is used to represent the market return, and excess returns are calculated by subtracting the median return from each stock's return over the time period $t$ \\[Zhu et al. 2012\\]. The numerical returns over median can then be used as is (for regression analysis), or can be relabeled simply to its sign (for classification analysis).\n",
    "\n",
    "At time $t$:\n",
    "\n",
    "$$P_t = \\{p_{t,0}, p_{t,1}, ..., p_{t,n}\\}$$\n",
    "$$m_t = median(P_t)$$\n",
    "$$L(P_t) = \\{p_{t,0} - m_t, p_{t,1} - m_t, ...,p_{t,n} - m_t\\}$$\n",
    "\n",
    "If categorical rather than numerical labels are desired:\n",
    "\n",
    "$$\n",
    "     \\begin{equation}\n",
    "     \\begin{split}\n",
    "       L(p_{t,n}) = \\begin{cases}\n",
    "       -1 &\\ \\text{if} \\ \\ p_{t,n} - m_t < 0\\\\\n",
    "       0 &\\ \\text{if} \\ \\ p_{t,n} - m_t = 0\\\\\n",
    "       1 &\\ \\text{if} \\ \\ p_{t,n} - m_t > 0\\\\\n",
    "       \\end{cases}\n",
    "     \\end{split}\n",
    "     \\end{equation}\n",
    "$$\n",
    "\n",
    "---\n",
    "## Examples of use"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "import yfinance as yf\n",
    "\n",
    "from mlfinlab.labeling import excess_over_median\n",
    "\n",
    "import matplotlib.pyplot as plt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[*********************100%***********************]  22 of 22 completed\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>AMD</th>\n",
       "      <th>FB</th>\n",
       "      <th>BABA</th>\n",
       "      <th>NVDA</th>\n",
       "      <th>MSFT</th>\n",
       "      <th>AAL</th>\n",
       "      <th>C</th>\n",
       "      <th>JPM</th>\n",
       "      <th>COST</th>\n",
       "      <th>F</th>\n",
       "      <th>...</th>\n",
       "      <th>UA</th>\n",
       "      <th>ZM</th>\n",
       "      <th>UBER</th>\n",
       "      <th>NOK</th>\n",
       "      <th>VZ</th>\n",
       "      <th>AAPL</th>\n",
       "      <th>PFE</th>\n",
       "      <th>CVX</th>\n",
       "      <th>SYY</th>\n",
       "      <th>CCL</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Date</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2019-01-22</th>\n",
       "      <td>19.760000</td>\n",
       "      <td>147.570007</td>\n",
       "      <td>152.149994</td>\n",
       "      <td>148.102676</td>\n",
       "      <td>103.568062</td>\n",
       "      <td>32.219025</td>\n",
       "      <td>59.116608</td>\n",
       "      <td>98.963676</td>\n",
       "      <td>209.413116</td>\n",
       "      <td>7.837517</td>\n",
       "      <td>...</td>\n",
       "      <td>18.590000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>5.846427</td>\n",
       "      <td>54.096535</td>\n",
       "      <td>150.266403</td>\n",
       "      <td>39.946537</td>\n",
       "      <td>105.162872</td>\n",
       "      <td>60.661575</td>\n",
       "      <td>51.750248</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-01-23</th>\n",
       "      <td>19.799999</td>\n",
       "      <td>144.300003</td>\n",
       "      <td>152.029999</td>\n",
       "      <td>148.620316</td>\n",
       "      <td>104.577469</td>\n",
       "      <td>31.146364</td>\n",
       "      <td>59.384232</td>\n",
       "      <td>98.713730</td>\n",
       "      <td>209.107468</td>\n",
       "      <td>7.689988</td>\n",
       "      <td>...</td>\n",
       "      <td>18.400000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>5.924772</td>\n",
       "      <td>54.827435</td>\n",
       "      <td>150.874130</td>\n",
       "      <td>39.842587</td>\n",
       "      <td>104.273567</td>\n",
       "      <td>60.884239</td>\n",
       "      <td>51.512947</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-01-24</th>\n",
       "      <td>20.850000</td>\n",
       "      <td>145.830002</td>\n",
       "      <td>155.860001</td>\n",
       "      <td>157.131958</td>\n",
       "      <td>104.077667</td>\n",
       "      <td>33.124386</td>\n",
       "      <td>59.938599</td>\n",
       "      <td>98.771400</td>\n",
       "      <td>207.352509</td>\n",
       "      <td>7.929724</td>\n",
       "      <td>...</td>\n",
       "      <td>18.860001</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>6.032495</td>\n",
       "      <td>54.172474</td>\n",
       "      <td>149.678253</td>\n",
       "      <td>38.699097</td>\n",
       "      <td>106.258125</td>\n",
       "      <td>60.535717</td>\n",
       "      <td>52.224846</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-01-25</th>\n",
       "      <td>21.930000</td>\n",
       "      <td>149.009995</td>\n",
       "      <td>159.210007</td>\n",
       "      <td>159.431610</td>\n",
       "      <td>105.028282</td>\n",
       "      <td>34.423378</td>\n",
       "      <td>61.190701</td>\n",
       "      <td>99.396294</td>\n",
       "      <td>206.129929</td>\n",
       "      <td>8.169458</td>\n",
       "      <td>...</td>\n",
       "      <td>19.490000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>6.463387</td>\n",
       "      <td>53.536488</td>\n",
       "      <td>154.638153</td>\n",
       "      <td>38.406132</td>\n",
       "      <td>105.986641</td>\n",
       "      <td>60.041988</td>\n",
       "      <td>52.689953</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-01-28</th>\n",
       "      <td>20.180000</td>\n",
       "      <td>147.470001</td>\n",
       "      <td>158.919998</td>\n",
       "      <td>137.390930</td>\n",
       "      <td>102.980049</td>\n",
       "      <td>35.988075</td>\n",
       "      <td>61.028221</td>\n",
       "      <td>99.867378</td>\n",
       "      <td>207.806046</td>\n",
       "      <td>7.985046</td>\n",
       "      <td>...</td>\n",
       "      <td>19.370001</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>6.355664</td>\n",
       "      <td>52.274014</td>\n",
       "      <td>153.207047</td>\n",
       "      <td>37.357147</td>\n",
       "      <td>105.003731</td>\n",
       "      <td>60.284012</td>\n",
       "      <td>53.534740</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 22 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                  AMD          FB        BABA        NVDA        MSFT  \\\n",
       "Date                                                                    \n",
       "2019-01-22  19.760000  147.570007  152.149994  148.102676  103.568062   \n",
       "2019-01-23  19.799999  144.300003  152.029999  148.620316  104.577469   \n",
       "2019-01-24  20.850000  145.830002  155.860001  157.131958  104.077667   \n",
       "2019-01-25  21.930000  149.009995  159.210007  159.431610  105.028282   \n",
       "2019-01-28  20.180000  147.470001  158.919998  137.390930  102.980049   \n",
       "\n",
       "                  AAL          C        JPM        COST         F  ...  \\\n",
       "Date                                                               ...   \n",
       "2019-01-22  32.219025  59.116608  98.963676  209.413116  7.837517  ...   \n",
       "2019-01-23  31.146364  59.384232  98.713730  209.107468  7.689988  ...   \n",
       "2019-01-24  33.124386  59.938599  98.771400  207.352509  7.929724  ...   \n",
       "2019-01-25  34.423378  61.190701  99.396294  206.129929  8.169458  ...   \n",
       "2019-01-28  35.988075  61.028221  99.867378  207.806046  7.985046  ...   \n",
       "\n",
       "                   UA  ZM  UBER       NOK         VZ        AAPL        PFE  \\\n",
       "Date                                                                          \n",
       "2019-01-22  18.590000 NaN   NaN  5.846427  54.096535  150.266403  39.946537   \n",
       "2019-01-23  18.400000 NaN   NaN  5.924772  54.827435  150.874130  39.842587   \n",
       "2019-01-24  18.860001 NaN   NaN  6.032495  54.172474  149.678253  38.699097   \n",
       "2019-01-25  19.490000 NaN   NaN  6.463387  53.536488  154.638153  38.406132   \n",
       "2019-01-28  19.370001 NaN   NaN  6.355664  52.274014  153.207047  37.357147   \n",
       "\n",
       "                   CVX        SYY        CCL  \n",
       "Date                                          \n",
       "2019-01-22  105.162872  60.661575  51.750248  \n",
       "2019-01-23  104.273567  60.884239  51.512947  \n",
       "2019-01-24  106.258125  60.535717  52.224846  \n",
       "2019-01-25  105.986641  60.041988  52.689953  \n",
       "2019-01-28  105.003731  60.284012  53.534740  \n",
       "\n",
       "[5 rows x 22 columns]"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Load price data for 22 stocks\n",
    "tickers = \"AAPL MSFT COST PFE SYY F GE BABA AMD CCL ZM FB WFC JPM NVDA CVX AAL UBER C UA VZ NOK\"\n",
    "\n",
    "data = yf.download(tickers, start=\"2019-01-20\", end=\"2020-05-25\",\n",
    "                   group_by=\"ticker\")\n",
    "data = data.loc[:, (slice(None), 'Adj Close')]\n",
    "data.columns = data.columns.droplevel(1)\n",
    "data.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We find the excess return over median for all tickers in the time period, calculate the mean and standard deviation of returns, and plot the distribution."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>AMD</th>\n",
       "      <th>FB</th>\n",
       "      <th>BABA</th>\n",
       "      <th>NVDA</th>\n",
       "      <th>MSFT</th>\n",
       "      <th>AAL</th>\n",
       "      <th>C</th>\n",
       "      <th>JPM</th>\n",
       "      <th>COST</th>\n",
       "      <th>F</th>\n",
       "      <th>...</th>\n",
       "      <th>UA</th>\n",
       "      <th>ZM</th>\n",
       "      <th>UBER</th>\n",
       "      <th>NOK</th>\n",
       "      <th>VZ</th>\n",
       "      <th>AAPL</th>\n",
       "      <th>PFE</th>\n",
       "      <th>CVX</th>\n",
       "      <th>SYY</th>\n",
       "      <th>CCL</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Date</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2019-01-22</th>\n",
       "      <td>0.001406</td>\n",
       "      <td>-0.022777</td>\n",
       "      <td>-0.001406</td>\n",
       "      <td>0.002877</td>\n",
       "      <td>0.009129</td>\n",
       "      <td>-0.033911</td>\n",
       "      <td>0.003909</td>\n",
       "      <td>-0.003143</td>\n",
       "      <td>-0.002077</td>\n",
       "      <td>-0.019441</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.010838</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.012783</td>\n",
       "      <td>0.012893</td>\n",
       "      <td>0.003427</td>\n",
       "      <td>-0.003220</td>\n",
       "      <td>-0.009074</td>\n",
       "      <td>0.003053</td>\n",
       "      <td>-0.005203</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-01-23</th>\n",
       "      <td>0.043061</td>\n",
       "      <td>0.000634</td>\n",
       "      <td>0.015223</td>\n",
       "      <td>0.047302</td>\n",
       "      <td>-0.014748</td>\n",
       "      <td>0.053538</td>\n",
       "      <td>-0.000634</td>\n",
       "      <td>-0.009385</td>\n",
       "      <td>-0.018362</td>\n",
       "      <td>0.021206</td>\n",
       "      <td>...</td>\n",
       "      <td>0.015031</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.008213</td>\n",
       "      <td>-0.021915</td>\n",
       "      <td>-0.017895</td>\n",
       "      <td>-0.038669</td>\n",
       "      <td>0.009063</td>\n",
       "      <td>-0.015693</td>\n",
       "      <td>0.003851</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-01-24</th>\n",
       "      <td>0.034036</td>\n",
       "      <td>0.004044</td>\n",
       "      <td>0.003731</td>\n",
       "      <td>-0.003127</td>\n",
       "      <td>-0.008629</td>\n",
       "      <td>0.021453</td>\n",
       "      <td>0.003127</td>\n",
       "      <td>-0.011436</td>\n",
       "      <td>-0.023659</td>\n",
       "      <td>0.012470</td>\n",
       "      <td>...</td>\n",
       "      <td>0.015642</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.053666</td>\n",
       "      <td>-0.029502</td>\n",
       "      <td>0.015375</td>\n",
       "      <td>-0.025333</td>\n",
       "      <td>-0.020317</td>\n",
       "      <td>-0.025918</td>\n",
       "      <td>-0.008857</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-01-25</th>\n",
       "      <td>-0.070535</td>\n",
       "      <td>-0.001071</td>\n",
       "      <td>0.007443</td>\n",
       "      <td>-0.128981</td>\n",
       "      <td>-0.010238</td>\n",
       "      <td>0.054719</td>\n",
       "      <td>0.006609</td>\n",
       "      <td>0.014004</td>\n",
       "      <td>0.017396</td>\n",
       "      <td>-0.013309</td>\n",
       "      <td>...</td>\n",
       "      <td>0.003107</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.007402</td>\n",
       "      <td>-0.014317</td>\n",
       "      <td>0.000010</td>\n",
       "      <td>-0.018049</td>\n",
       "      <td>-0.000010</td>\n",
       "      <td>0.013295</td>\n",
       "      <td>0.025297</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-01-28</th>\n",
       "      <td>-0.040490</td>\n",
       "      <td>-0.016647</td>\n",
       "      <td>-0.007242</td>\n",
       "      <td>-0.040851</td>\n",
       "      <td>-0.014771</td>\n",
       "      <td>-0.002062</td>\n",
       "      <td>-0.004429</td>\n",
       "      <td>0.008386</td>\n",
       "      <td>0.003412</td>\n",
       "      <td>0.017142</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.017121</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.014840</td>\n",
       "      <td>-0.026909</td>\n",
       "      <td>-0.004770</td>\n",
       "      <td>0.036963</td>\n",
       "      <td>0.002564</td>\n",
       "      <td>0.002062</td>\n",
       "      <td>0.003112</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 22 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                 AMD        FB      BABA      NVDA      MSFT       AAL  \\\n",
       "Date                                                                     \n",
       "2019-01-22  0.001406 -0.022777 -0.001406  0.002877  0.009129 -0.033911   \n",
       "2019-01-23  0.043061  0.000634  0.015223  0.047302 -0.014748  0.053538   \n",
       "2019-01-24  0.034036  0.004044  0.003731 -0.003127 -0.008629  0.021453   \n",
       "2019-01-25 -0.070535 -0.001071  0.007443 -0.128981 -0.010238  0.054719   \n",
       "2019-01-28 -0.040490 -0.016647 -0.007242 -0.040851 -0.014771 -0.002062   \n",
       "\n",
       "                   C       JPM      COST         F  ...        UA  ZM  UBER  \\\n",
       "Date                                                ...                       \n",
       "2019-01-22  0.003909 -0.003143 -0.002077 -0.019441  ... -0.010838 NaN   NaN   \n",
       "2019-01-23 -0.000634 -0.009385 -0.018362  0.021206  ...  0.015031 NaN   NaN   \n",
       "2019-01-24  0.003127 -0.011436 -0.023659  0.012470  ...  0.015642 NaN   NaN   \n",
       "2019-01-25  0.006609  0.014004  0.017396 -0.013309  ...  0.003107 NaN   NaN   \n",
       "2019-01-28 -0.004429  0.008386  0.003412  0.017142  ... -0.017121 NaN   NaN   \n",
       "\n",
       "                 NOK        VZ      AAPL       PFE       CVX       SYY  \\\n",
       "Date                                                                     \n",
       "2019-01-22  0.012783  0.012893  0.003427 -0.003220 -0.009074  0.003053   \n",
       "2019-01-23  0.008213 -0.021915 -0.017895 -0.038669  0.009063 -0.015693   \n",
       "2019-01-24  0.053666 -0.029502  0.015375 -0.025333 -0.020317 -0.025918   \n",
       "2019-01-25 -0.007402 -0.014317  0.000010 -0.018049 -0.000010  0.013295   \n",
       "2019-01-28  0.014840 -0.026909 -0.004770  0.036963  0.002564  0.002062   \n",
       "\n",
       "                 CCL  \n",
       "Date                  \n",
       "2019-01-22 -0.005203  \n",
       "2019-01-23  0.003851  \n",
       "2019-01-24 -0.008857  \n",
       "2019-01-25  0.025297  \n",
       "2019-01-28  0.003112  \n",
       "\n",
       "[5 rows x 22 columns]"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "excess1 = excess_over_median(data)\n",
    "excess1.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can visualize the distribution as a histogram."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZYAAAEWCAYAAABFSLFOAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAgAElEQVR4nO3de7gcVZnv8e+PBEOQS8IkICSBgMQLcDTABjziBQG5CaIzoIhKRJyI4jiOOIegjFyUEUeBGUZHAY0EFCGgaLgoBBCVUSABwiUgsoVIQkISLoEAMZDwnj/Waqh0unt371299+7w+zxPP921VtWqt6ur++26rVJEYGZmVpb1BjoAMzNbtzixmJlZqZxYzMysVE4sZmZWKicWMzMrlROLmZmVyomlZJK+L+nfSmpra0nPShqSh2+S9Kky2s7t/UrSpLLaa2G+X5f0uKTH+nve1jxJ4yWFpKF5uG3ri9cJkDRP0r4DHUcZnFhakD/4FZKWS1om6Q+SjpX08nKMiGMj4mtNttVwJYqIRyJio4hYXULsp0j6cVX7B0bEtL623WIc44DjgR0i4nU16veS9FJOqMslPSDp6Bbav0DS18uMucn5HizpNknPSXpC0k8kje3H+YekxZUkkMuGSloiqZSL1dq1vvS0TvSivc0l/VTSQklPS/pfSXsU6t8n6eb8HX5M0vmSNm7Q3jvyd/1pSU/m9nbLdZ+QdHNfY17XOLG07pCI2BjYBjgDOAH4YdkzKf5ArGO2AZ6IiCUNxlkYERsBmwD/Apwv6Y39EVxvlrukw4CLgf8CRgE7AiuBmyWN7Mf4lgEHFoYPAp4qc/5t0sw6UVOd5bERMAvYFdgMmAZcLWmjXL8p8HVgK+DNwFjgW3Xa3wS4Cvjv3NYY4FTS52v1RIQfTT6AecC+VWW7Ay8BO+XhC4Cv59ejSCvlMuBJ4PekZH5RnmYF8Czw/4DxQADHAI8AvyuUDc3t3QR8A7gNeBr4JbBZrtsLWFArXuAA4AXgxTy/uwrtfSq/Xg84CfgrsAS4ENg011XimJRjexz4SoPltGmefmlu76Tc/r75Pb+U47igxrS13scS4PDC8JuAmXmZPgB8KJdPzu/xhdz+lbk8gO0L0xc/o72ABaQ/CI/lz6ZSdnye9yLg6DrvVfk9/r+q8vWAe4HTgGF5HdipUD86L4vN8/DBwJw83h+At1R9jicAd5N+0IbWiCPycr6sUHY58BUgqj6bH+b39CjpB3ZIrhsCfDt/vg8Bx7H2+ldZX14P3Ag8kcf/CTCiKuYv5ZifBi4FNqgRd811Ang/MDcvj5uAN7eyPGrM5xlg1zp1fw/cU6euC1hWp+7NwN+A1Tn2ZY3W/8J0/wjcDywH7gN2qf59Ia3jDwNH5OET8ue1nLTO7zPQv4cNl/dAB9BJD2okllz+CPCZ/PoCXvnR+gbwfWD9/HgnoFpt8cqP94XAa4Hh1E4sjwI75XF+Bvw41+1FncSSX59SGbdQfxOv/FB8EugGtiP94/s5cFFVbOfnuN6av9BvrrOcLiQlvY3ztH8GjqkXZ9W0L9eTfpzfT/rR2TmXvRaYDxwNDAV2If2w7Vi9/Att9pRYVgHfJCWA4YWy0/LndhDwPDCyRrxvyu1vW6PuVOCP+fVU4PRC3XHAr/PrXUgJbA/Sj/uk/NkNK3yOc4BxwPA6yy3yerEYGJEfi3NZFMb7BXBuXo6bk/6kfDrXHQv8Kc9nM+A31E8s2wPvzctsNOmP0H9WrXu3kbYKNiP9kB7b02eeh98APJfbX5/0x6sbeE2zy6Oq/YmkBLBpnfr/BC6pU7cJKXlOI20Njqyq/wRwcwvr/+Gk7/BupD8l2wPbFL+veX14BDg4l7+RtM5vVfg+vr7dv3d9eXhXWDkWkr481V4EtiStOC9GxO8jrxkNnBIRz0XEijr1F0XEvRHxHPBvwIcqB/f76KPAWRHxUEQ8C5wIHFG1q+HUiFgREXcBd5ESzBpyLB8GToyI5RExDzgT+HgLsWwlaRnpn+wVwBcj4s5cdzAwLyJ+FBGrIuIOUoI9rKV3u6aXgJMjYmVhub8InJY/t2tI/0hr7Y4blZ8X1ahbVKi/GPhIoe7IXAbpH+y5EXFrRKyOdBxjJfC2wvjnRMT8BusFpB/PK0nL/whgRi4DQNIWpB/HL+R1bAlwdh4X4EOk5DA/Ip4k/TGqKSK6I2JmXmZLgbOAd1eNdk5ELMxtXUn6gW/Gh4Grc/svkraihgNvr2q7p+VR2ZV1EWndfbpG/XtJifyrtaaPiGeAd/DKH6ulkmbkZVlrfj2t/58C/iMiZkXSHRF/LTTxTtLnNikirsplq0kJfAdJ60fEvIj4S6P3PdCcWMoxhrRbptq3SP+0rpP0kKQpTbQ1v4X6v5L+0Y2qM24rtsrtFdseChS/QMUzdp4nbdlUGwW8pkZbY1qIZWFEjCD9WzwH2LtQtw2wRz7wuiwnoI8CfTnouzQi/lZV9kRErCoM13u/j+fnLWvUbVmovxEYLmkPSduQfmSvyHXbAMdXvadxpM+koqf1ouJC4Kj8uLCqbhvS+rKoMJ9zSVsu5PlVr1815QPkl0h6VNIzwI9Zez1sZn2pZY11MSJeynEV16Eel4ek4aSEdktErJUkJb2NlNwPi4g/12snIu6PiE9ExFjSFuBWpK2cWnpa/8cBjZLCscAfIuI3hfl3A18g7XVYkpf7VnWmHxScWPoonx0yBljrzJD8j+X4iNgOOAT4oqR9KtV1muxpi2Zc4fXWpH/Wj5N2HWxYiGsIaRdFs+0uJP3wFNteRdqd0orHc0zVbT3aYjtExErSvuX/I+kDuXg+8NuIGFF4bBQRn6lMVqOp5yksG9ZOQn05a+oB0vGYw4uF+UzBfwBuyO/lJWA6aavlSOCqiFheeE+nV72nDSPip72I8fekhLYFa6+T80lbQqMK89kkInbM9YtYe/2q5xs5prdExCbAx0i7dsqwxrooSTmu4jrUcHlIGkba7fco8Oka9TuTtgw+GRE3NBtYRPyJtCt1pzpx9LT+zycdn6rnWGBrSWdXzffiiHhHbjdIu24HLSeWXpK0iaSDgUtIxy7uqTHOwZK2z1+MZ0ibtJVThxeTjme06mOSdpC0IekYwOWRTkf+M7BBPpVyfdIBw2GF6RYD44unRlf5KfAvkrbNZ8/8O3Bp1b/2HuVYpgOnS9o4/zv/Iukfbcsi4gXSroTKroqrgDdI+rik9fNjN0lvzvW1lusc4EhJQyQdwNq7bHot79r8EnCSpCMlDZf0OuAHpC2u4g/ExaTdJB/lld1gkHaxHJu3ZiTptflzrHsKbA/xHAK8v3q3a0QsAq4Dzszr73qSXi+psjymA5+XNDafzdZoC3tj8gFrSWOAf2011gamA++TtE9el48nJcQ/NDNxnuZy0q7Uo3JSL9bvBPwa+KeIuLKHtt4k6fjKqeP51OiPALfkURYDYyW9Bppa/38AfEnSrvmz3j6PU7GcdLLNuySdkef5Rkl752T5t/y++nwJQjs5sbTuSknLSf88vkLat1zvOosJwPWkL+Afgf+JiJty3TdIP0bLJH2phflfRPrH9BiwAfB5gLz/+LOkFfdR0hbMgsJ0l+XnJyTdUaPdqbnt35HORvkb8E8txFX0T3n+D5H+NV+c2++tqaR/cYfkf/n7kY4LLCQth8qBd0hnPO2Ql+svctk/k35sK7vNfkGJIuJS0j70fyH9Y72PdExgz4h4ojDeraTlshXwq0L5bNJxlu+QTg/uJh0U7m08cyNibp3qo0i7au7L87qcV3bjnQ9cSzp+dgfpBI56TiUdZH4auLqHcVsSEQ+QtoD+m7Q8DyGd5v9Ck028nXQsbj9S4ns2P96Z648nbc3/sFBXb3ktJ51Ucauk50gJ5d7cBqRdnHOBxyRVdnvWXf8j4jLg9Fy2nLQurnF8NiKWkU5cOFDS10jr9hl5WTxG2nX55SaXxYBQ9Hgs2czMrHneYjEzs1I5sZiZWamcWMzMrFROLGZmVqp1sqPDUaNGxfjx4wc6DDOzjnL77bc/HhGjex6zsXUysYwfP57Zs2cPdBhmZh1FUt3eFlrRtl1hkjZQuj/FXZLmSjo1l28r6VZJD0q6tHJhkaRhebg7148vtHViLn9A0v7titnMzPquncdYVgJ7R8RbSf0iHZD75vkmcHZETCBdoHVMHv8Y4KmI2J50tfI3ASTtQLoYbkfSFan/o3I6XTQzszZoW2LJPXc+mwcr3cYHqUPBy3P5NKDSB9SheZhcv0/uCuVQUpfWKyPiYdJVybu3K24zM+ubtp4VlvtmmkO618RMUq+eywr9Ty3glV4/x5B7LM31TwN/VyyvMU1xXpMlzZY0e+nSpe14O2Zm1oS2JpZ8b4mJpFt/7k6649pao+XnWj2jRoPy6nmdFxFdEdE1enSfT2owM7Ne6pfrWHKnajeRblw0Qq/cPGosqSNBSFsi4+Dl+1hvSrrHycvlNaYxM7NBpp1nhY2WNCK/Hk665eb9pNudVu72N4l0C0/Id03Lrw8Dbszdfs8g3clwmKRtST0G39auuM3MrG/aeR3LlsC0fAbXesD0iLhK0n3AJZK+DtxJ6uac/HyRpG7SlsoRkLoAlzSd1M33KuC4fM8DMzMbhNbJbvO7urrCF0iambVG0u0R0dXXdtbJK+/NBqPxU65uarx5Z7yvzZGYtZc7oTQzs1I5sZiZWamcWMzMrFROLGZmVionFjMzK5UTi5mZlcqJxczMSuXEYmZmpXJiMTOzUjmxmJlZqZxYzMysVE4sZmZWKicWMzMrlROLmZmVyonFzMxK5cRiZmalcmIxM7NSObGYmVmpnFjMzKxUTixmZlYqJxYzMyuVE4uZmZXKicXMzErlxGJmZqVyYjEzs1K1LbFIGifpN5LulzRX0j/n8lMkPSppTn4cVJjmREndkh6QtH+h/IBc1i1pSrtiNjOzvhvaxrZXAcdHxB2SNgZulzQz150dEd8ujixpB+AIYEdgK+B6SW/I1d8F3gssAGZJmhER97UxdjMz66W2JZaIWAQsyq+XS7ofGNNgkkOBSyJiJfCwpG5g91zXHREPAUi6JI/rxGJmNgj1yzEWSeOBnYFbc9HnJN0taaqkkblsDDC/MNmCXFav3MzMBqG2JxZJGwE/A74QEc8A3wNeD0wkbdGcWRm1xuTRoLx6PpMlzZY0e+nSpaXEbmZmrWtrYpG0Pimp/CQifg4QEYsjYnVEvASczyu7uxYA4wqTjwUWNihfQ0ScFxFdEdE1evTo8t+MmZk1pZ1nhQn4IXB/RJxVKN+yMNoHgXvz6xnAEZKGSdoWmADcBswCJkjaVtJrSAf4Z7QrbjMz65t2nhW2J/Bx4B5Jc3LZl4GPSJpI2p01D/g0QETMlTSddFB+FXBcRKwGkPQ54FpgCDA1Iua2MW4zM+uDdp4VdjO1j49c02Ca04HTa5Rf02g6MzMbPHzlvZmZlcqJxczMSuXEYmZmpXJiMTOzUjmxmJlZqZxYzMysVE4sZmZWKicWMzMrlROLmZmVyonFzMxK5cRiZmalcmIxM7NSObGYmVmpnFjMzKxUTixmZlYqJxYzMyuVE4uZmZXKicXMzErlxGJmZqVyYjEzs1I5sZiZWamcWMzMrFROLGZmVionFjMzK5UTi5mZlcqJxczMSuXEYmZmpWpbYpE0TtJvJN0vaa6kf87lm0maKenB/Dwyl0vSOZK6Jd0taZdCW5Py+A9KmtSumM3MrO/aucWyCjg+It4MvA04TtIOwBTghoiYANyQhwEOBCbkx2Tge5ASEXAysAewO3ByJRmZmdng07bEEhGLIuKO/Ho5cD8wBjgUmJZHmwZ8IL8+FLgwkluAEZK2BPYHZkbEkxHxFDATOKBdcZuZWd/0yzEWSeOBnYFbgS0iYhGk5ANsnkcbA8wvTLYgl9Urr57HZEmzJc1eunRp2W/BzMya1PbEImkj4GfAFyLimUaj1iiLBuVrFkScFxFdEdE1evTo3gVrZmZ91tbEIml9UlL5SUT8PBcvzru4yM9LcvkCYFxh8rHAwgblZmY2CLXzrDABPwTuj4izClUzgMqZXZOAXxbKj8pnh70NeDrvKrsW2E/SyHzQfr9cZmZmg9DQNra9J/Bx4B5Jc3LZl4EzgOmSjgEeAQ7PddcABwHdwPPA0QAR8aSkrwGz8ninRcSTbYzbzMz6oG2JJSJupvbxEYB9aowfwHF12poKTC0vOjMzaxdfeW9mZqVyYjEzs1I5sZiZWamaOsYiaaeIuLfdwZh1mvFTrh7oEMwGnWa3WL4v6TZJn5U0oq0RmZlZR2sqsUTEO4CPki5UnC3pYknvbWtkZmbWkZo+xhIRDwInAScA7wbOkfQnSX/fruDMzKzzNJVYJL1F0tmkHor3Bg7J3eHvDZzdxvjMzKzDNHuB5HeA84EvR8SKSmFELJR0UlsiMzOzjtRsYjkIWBERqwEkrQdsEBHPR8RFbYvOzMw6TrPHWK4HhheGN8xlZmZma2g2sWwQEc9WBvLrDdsTkpmZdbJmE8tzknapDEjaFVjRYHwzM3uVavYYyxeAyyRVbrC1JfDh9oRkZmadrKnEEhGzJL0JeCOpK/w/RcSLbY3MzMw6Uiv3Y9kNGJ+n2VkSEXFhW6IyM7OO1WwnlBcBrwfmAKtzcQBOLGZmtoZmt1i6gB3yXR7NzMzqavassHuB17UzEDMzWzc0u8UyCrhP0m3AykphRLy/LVGZmVnHajaxnNLOIMzMbN3R7OnGv5W0DTAhIq6XtCEwpL2hmZlZJ2q22/x/BC4Hzs1FY4BftCsoMzPrXM0evD8O2BN4Bl6+6dfm7QrKzMw6V7OJZWVEvFAZkDSUdB2LmZnZGppNLL+V9GVgeL7X/WXAle0Ly8zMOlWziWUKsBS4B/g0cA3Q8M6RkqZKWiLp3kLZKZIelTQnPw4q1J0oqVvSA5L2L5QfkMu6JU1p5c2ZmVn/a/assJdItyY+v4W2LyDd0ri625ezI+LbxQJJOwBHADsCWwHXS3pDrv4u8F5gATBL0oyIuK+FOMzMrB8121fYw9Q4phIR29WbJiJ+J2l8k3EcClwSESuBhyV1A7vnuu6IeCjHcUke14nFzGyQaqWvsIoNgMOBzXo5z89JOgqYDRwfEU+RTl++pTDOglwGML+qfI9ajUqaDEwG2HrrrXsZmpmZ9VVTx1gi4onC49GI+E9g717M73ukXpInAouAM3O5as22QXmtGM+LiK6I6Bo9enQvQjMzszI0uytsl8LgeqQtmI1bnVlELC60eT5wVR5cAIwrjDoWqNytsl65mZkNQs3uCjuz8HoVMA/4UKszk7RlRCzKgx8k9ZoMMAO4WNJZpIP3E4DbSFssEyRtCzxKOsB/ZKvzNTOz/tPsWWHvabVhST8F9gJGSVoAnAzsJWkiaXfWPNKpy0TEXEnTSQflVwHHRcTq3M7ngGtJfZNNjYi5rcZiZmb9p9ldYV9sVB8RZ9Uo+0iNUX/YoI3TgdNrlF9Dum7GzMw6QCtnhe1G2mUFcAjwO9Y8Y8vMzKylG33tEhHLIV1BD1wWEZ9qV2BmZtaZmu3SZWvghcLwC8D40qMxM7OO1+wWy0XAbZKuIB14/yBrd9ViZmbW9Flhp0v6FfDOXHR0RNzZvrDMzKxTNbsrDGBD4JmI+C9gQb62xMzMbA3N3pr4ZOAE4MRctD7w43YFZWZmnavZLZYPAu8HngOIiIX0oksXMzNb9zWbWF6IiCB3ACnpte0LyczMOlmziWW6pHOBEZL+Ebie1m76ZWZmrxLNnhX27Xyv+2eANwJfjYiZbY3MzMw6Uo+JRdIQ4NqI2BdwMjEzs4Z63BWWexl+XtKm/RCPmZl1uGavvP8bcI+kmeQzwwAi4vNticrMzDpWs4nl6vwwMzNrqGFikbR1RDwSEdP6KyAzM+tsPR1j+UXlhaSftTkWMzNbB/SUWFR4vV07AzEzs3VDT4kl6rw2MzOrqaeD92+V9Axpy2V4fk0ejojYpK3RmZlZx2mYWCJiSH8FYmZm64ZW7sdiZmbWIycWMzMrlROLmZmVyonFzMxK5cRiZmalcmIxM7NStS2xSJoqaYmkewtlm0maKenB/Dwyl0vSOZK6Jd0taZfCNJPy+A9KmtSueM3MrBzt3GK5ADigqmwKcENETABuyMMABwIT8mMy8D1IiQg4GdgD2B04uZKMzMxscGpbYomI3wFPVhUfClR6Sp4GfKBQfmEktwAjJG0J7A/MjIgnI+Ip0h0sq5OVmZkNIv19jGWLiFgEkJ83z+VjgPmF8Rbksnrla5E0WdJsSbOXLl1aeuBmZtacwXLwXjXKokH52oUR50VEV0R0jR49utTgzMysef2dWBbnXVzk5yW5fAEwrjDeWGBhg3IzMxuk+juxzAAqZ3ZNAn5ZKD8qnx32NuDpvKvsWmA/SSPzQfv9cpmZmQ1Szd7zvmWSfgrsBYyStIB0dtcZwHRJxwCPAIfn0a8BDgK6geeBowEi4klJXwNm5fFOi4jqEwLMzGwQaVtiiYiP1Knap8a4ARxXp52pwNQSQzMzszYaLAfvzcxsHdG2LRYz653xU65uetx5Z7yvjZGY9Y63WMzMrFROLGZmVionFjMzK5UTi5mZlcqJxczMSuXEYmZmpXJiMTOzUjmxmJlZqZxYzMysVE4sZmZWKicWMzMrlROLmZmVyonFzMxK5cRiZmalcmIxM7NSObGYmVmpnFjMzKxUTixmZlYqJxYzMyuVE4uZmZXKicXMzErlxGJmZqVyYjEzs1INHegAzAab8VOuHugQzDragGyxSJon6R5JcyTNzmWbSZop6cH8PDKXS9I5krol3S1pl4GI2czMmjOQu8LeExETI6IrD08BboiICcANeRjgQGBCfkwGvtfvkZqZWdMG0zGWQ4Fp+fU04AOF8gsjuQUYIWnLgQjQzMx6NlCJJYDrJN0uaXIu2yIiFgHk581z+RhgfmHaBblsDZImS5otafbSpUvbGLqZmTUyUAfv94yIhZI2B2ZK+lODcVWjLNYqiDgPOA+gq6trrXozM+sfA7LFEhEL8/MS4Apgd2BxZRdXfl6SR18AjCtMPhZY2H/RmplZK/o9sUh6raSNK6+B/YB7gRnApDzaJOCX+fUM4Kh8dtjbgKcru8zMzGzwGYhdYVsAV0iqzP/iiPi1pFnAdEnHAI8Ah+fxrwEOArqB54Gj+z9kMzNrVr8nloh4CHhrjfIngH1qlAdwXD+EZmZmJRhMpxubmdk6wInFzMxK5cRiZmalcmIxM7NSObGYmVmpnFjMzKxUTixmZlYqJxYzMyuVE4uZmZXKtyY262DN3kZ53hnva3MkZq/wFouZmZXKicXMzErlxGJmZqVyYjEzs1I5sZiZWal8Vpi9ajR7BpWZ9Y23WMzMrFROLGZmVionFjMzK5UTi5mZlcqJxczMSuWzwsxeBVo5I879illfeYvFzMxK5cRiZmalcmIxM7NS+RiLdTRfTV8+H4+xvvIWi5mZlapjtlgkHQD8FzAE+EFEnDHAIVkbeUukM/gOllZLRyQWSUOA7wLvBRYAsyTNiIj7BjYya4WTxauXd6+9unREYgF2B7oj4iEASZcAhwJOLC3wD7t1goFeT53Y+q5TEssYYH5heAGwR3EESZOByXlwpaR7+ym2vhgFPD7QQTTBcZbLcZar1Dj1zbJaWkOnLMs3ltFIpyQW1SiLNQYizgPOA5A0OyK6+iOwvnCc5XKc5XKc5emEGCHFWUY7nXJW2AJgXGF4LLBwgGIxM7MGOiWxzAImSNpW0muAI4AZAxyTmZnV0BG7wiJilaTPAdeSTjeeGhFzG0xyXv9E1meOs1yOs1yOszydECOUFKciouexzMzMmtQpu8LMzKxDOLGYmVmpOjaxSNpM0kxJD+bnkTXGmSjpj5LmSrpb0ocLddtKujVPf2k+KWBA4szj/VrSMklXVZVfIOlhSXPyY+IgjXOwLc9JeZwHJU0qlN8k6YHC8ty8xNgOyG13S5pSo35YXjbdeVmNL9SdmMsfkLR/WTGVGaek8ZJWFJbd9wc4zndJukPSKkmHVdXV/PwHYZyrC8uzrSckNRHnFyXdl38rb5C0TaGuteUZER35AP4DmJJfTwG+WWOcNwAT8uutgEXAiDw8HTgiv/4+8JmBijPX7QMcAlxVVX4BcNhgWJ49xDloliewGfBQfh6ZX4/MdTcBXW2IawjwF2A74DXAXcAOVeN8Fvh+fn0EcGl+vUMefxiwbW5nSJuWX1/iHA/c2+51sYU4xwNvAS4sfkcaff6DKc5c9+wgWp7vATbMrz9T+NxbXp4du8VC6tJlWn49DfhA9QgR8eeIeDC/XggsAUZLErA3cHmj6fsrzhzfDcDyNsXQjF7HOQiX5/7AzIh4MiKeAmYCB7QpnoqXux2KiBeASrdDRcXYLwf2ycvuUOCSiFgZEQ8D3bm9wRZnf+oxzoiYFxF3Ay9VTdufn39f4uxPzcT5m4h4Pg/eQrpeEHqxPDs5sWwREYsA8nPDXRqSdidl6r8Afwcsi4hVuXoBqduYAY+zjtPz5unZkoaVG97L+hLnYFuetboAKsbzo7zr4d9K/MHsaZ5rjJOX1dOkZdfMtGXpS5wA20q6U9JvJb2zTTE2G2c7pm1VX+e1gaTZkm6R1K4/Y9B6nMcAv+rltIP7OhZJ1wOvq1H1lRbb2RK4CJgUES/V+THp9XnXZcVZx4nAY6SkeB5wAnBabxpqY5yDbXk2iuejEfGopI2BnwEfJ+2i6KtmlkG9cUpdfj3oS5yLgK0j4glJuwK/kLRjRDxTdpANYmj3tK3q67y2joiFkrYDbpR0T0T8paTYipqOU9LHgC7g3a1OWzGoE0tE7FuvTtJiSVtGxKKcOJbUGW8T4GrgpIi4JRc/DoyQNDT/I+tTFzFlxNmg7UX55UpJPwK+NAjjHGzLcwGwV2F4LOnYChHxaH5eLuli0i6CMhJLM90OVcZZIGkosCnwZJPTlqXXcUba4b4SICJul/QX0nHMUvqX6kWcjabdq2ram0qJqva8ev3Z5V30RMRDkm4CdibtVSlbU3FK2pf0B+7dEbGyMO1eVdPe1GhmnbwrbAZQOTthEvDL6hGUzky6ArgwIi6rlOOw2KcAAAUySURBVOcvyG+AwxpN319xNpJ/PCvHMT4AtKvX5l7HOQiX57XAfpJGKp01th9wraShkkYBSFofOJjylmcz3Q4VYz8MuDEvuxnAEflsrG2BCcBtJcVVWpySRivdG4n8D3sC6UDuQMVZT83Pf7DFmeMbll+PAvakfbcC6TFOSTsD5wLvj4jiH7bWl2d/nJHQjgdpn+8NwIP5ebNc3kW6wyTAx4AXgTmFx8Rctx3py9sNXAYMG6g48/DvgaXACtI/hP1z+Y3APaQfwB8DGw3SOAfb8vxkjqUbODqXvRa4HbgbmEu+I2mJsR0E/Jn0j/Mruew00hcVYIO8bLrzstquMO1X8nQPAAe2+bvTqziBf8jL7S7gDuCQAY5zt7wOPgc8Acxt9PkPtjiBt+fv9l35+ZgBjvN6YDGv/FbO6O3ydJcuZmZWqk7eFWZmZoOQE4uZmZXKicXMzErlxGJmZqVyYjEzs1I5sVhHKfQGe6+kKyWN6GH8EZI+2+aYdpR0o6Q/595fy+wqpjif8ZJC0tcKZaMkvSjpOy22Na9wTc8fyo7VXt2cWKzTrIiIiRGxE+mq9eN6GH8EqbfellQuBGxivOGkC83OiIg3AG8lXZ/Q52SWr3qv9hDpws6Kw0nXlvRaRLy9L9ObVXNisU72Rwqd4Un6V0mzcoedp+biM4DX562cb0naS4V7yUj6jqRP5NfzJH1V0s3A4Ur3bvmmpNvy1kitThePBP43Iq4DiNQ77OeAKZLWy22OKMyvW9IW+Sr2n+V4Z0naM9efIuk8SddRu6uZFcD9krry8IdJtyyotF+v3b+TdJ1SB5LnUuj/SdKz+Xkjpftw3CHpHkmH5vLxku6XdL7SvY2uywnVrCYnFutIeYtiH3K3FJL2I3UxsjswEdhV0rtI92z5S97K+dcmmv5bRLwjIi7Jw0MjYnfgC8DJNcbfkXQ1/8sidSK4UX78EvhgjnEPYF5ELCZd9X92ROxGuqL9B4UmdgUOjYgj68R4CakLmLHAatbs86leuycDN0fEzqRltnWt9w58MCJ2Id2b48zCLr0JwHcjYkdgWW7brKZB3QmlWQ3DJc0h3TzpdtK9ISD1X7QfcGce3oj0Y/hIi+1fWjX88/x8e55nNVG/p9fI7X0V+BH5plm5bl9gh8KhmE2UelyG1JXGigYx/hr4Gqn7jep467X7LuDvASLiaklP1Xkv/54T8kukrcEtct3DETEnv663LMwAJxbrPCsiYqKkTYGrSMdYziH9KH4jIs4tjqzC7X+zVay5pb5BVf1zVcOVHl5XU/v7Mpf0o12c53akOwMul/RHYHtJo0mdiH49j7Ye8H+rE0hOCNUxrCEiXpB0O3A8aYvpkEJ1o3Z76r/po8BoYNeIeFHSPF5ZPisL460GvCvM6vKuMOtIEfE08HngS0o9FV8LfFLSRgCSxijdz345sHFh0r+S/tEPy8lpnz6G8hPgHUrdjVcO5p9DuoUykTrjuwI4C7g/Ip7I011HOhZDnm5ii/M9Ezih0F5FvXZ/R0ocSDqQdIvZapsCS3JSeQ+wTY1xzHrkxGIdKyLuJPUMe0Q+eH4x8EdJ95Buqbtx/uH933x68rciYj7pYPfdpKRwZ53mm41hBekWrydJeoDUS+0soHj676WknraLu60+D3TlEw3uA45tcb5zI2Jajap67Z4KvEvSHaRdhrV2Ef4kTzublIT+1EpMZhXu3djMzErlLRYzMyuVE4uZmZXKicXMzErlxGJmZqVyYjEzs1I5sZiZWamcWMzMrFT/H5tG/Uy0rulGAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "s2 = pd.Series(excess1.iloc[:-1, :].values.flatten())\n",
    "ax2 = s2.plot.hist(bins=50)\n",
    "ax2.set_xlim(-0.2,0.2)\n",
    "ax2.set_xlabel('Return Over Median')\n",
    "ax2.set_title('Distribution of Return Over Median for 22 Stocks')\n",
    "plt.savefig('distribution_over_median.png')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Instead of returning the numerical value of excess return over median, we can also simply return the sign. Using categorical rather than numerical labels alleviates problems that can arise due to extreme outlier returns [Zhu et al. 2012]."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "c:\\users\\ruifan\\mlfinlab\\mlfinlab\\labeling\\excess_over_median.py:41: RuntimeWarning: invalid value encountered in sign\n",
      "  return np.sign(returns_over_median)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>AMD</th>\n",
       "      <th>FB</th>\n",
       "      <th>BABA</th>\n",
       "      <th>NVDA</th>\n",
       "      <th>MSFT</th>\n",
       "      <th>AAL</th>\n",
       "      <th>C</th>\n",
       "      <th>JPM</th>\n",
       "      <th>COST</th>\n",
       "      <th>F</th>\n",
       "      <th>...</th>\n",
       "      <th>UA</th>\n",
       "      <th>ZM</th>\n",
       "      <th>UBER</th>\n",
       "      <th>NOK</th>\n",
       "      <th>VZ</th>\n",
       "      <th>AAPL</th>\n",
       "      <th>PFE</th>\n",
       "      <th>CVX</th>\n",
       "      <th>SYY</th>\n",
       "      <th>CCL</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Date</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2019-01-22</th>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-01-23</th>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-01-24</th>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-01-25</th>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-01-28</th>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 22 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "            AMD   FB  BABA  NVDA  MSFT  AAL    C  JPM  COST    F  ...   UA  \\\n",
       "Date                                                              ...        \n",
       "2019-01-22  1.0 -1.0  -1.0   1.0   1.0 -1.0  1.0 -1.0  -1.0 -1.0  ... -1.0   \n",
       "2019-01-23  1.0  1.0   1.0   1.0  -1.0  1.0 -1.0 -1.0  -1.0  1.0  ...  1.0   \n",
       "2019-01-24  1.0  1.0   1.0  -1.0  -1.0  1.0  1.0 -1.0  -1.0  1.0  ...  1.0   \n",
       "2019-01-25 -1.0 -1.0   1.0  -1.0  -1.0  1.0  1.0  1.0   1.0 -1.0  ...  1.0   \n",
       "2019-01-28 -1.0 -1.0  -1.0  -1.0  -1.0 -1.0 -1.0  1.0   1.0  1.0  ... -1.0   \n",
       "\n",
       "            ZM  UBER  NOK   VZ  AAPL  PFE  CVX  SYY  CCL  \n",
       "Date                                                      \n",
       "2019-01-22 NaN   NaN  1.0  1.0   1.0 -1.0 -1.0  1.0 -1.0  \n",
       "2019-01-23 NaN   NaN  1.0 -1.0  -1.0 -1.0  1.0 -1.0  1.0  \n",
       "2019-01-24 NaN   NaN  1.0 -1.0   1.0 -1.0 -1.0 -1.0 -1.0  \n",
       "2019-01-25 NaN   NaN -1.0 -1.0   1.0 -1.0 -1.0  1.0  1.0  \n",
       "2019-01-28 NaN   NaN  1.0 -1.0  -1.0  1.0  1.0  1.0  1.0  \n",
       "\n",
       "[5 rows x 22 columns]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "excess2 = excess_over_median(data, binary=True)\n",
    "excess2.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can verify that the number of positive labels matches the number of negative labels. Note: for larger data sets, there is increased chance that some tickers will have the exact same return for a given time index. If that return is also equal to the the median, the number of positive labels may not match exactly with the number of negatives, but should be very close."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "-1.0    3629\n",
       " 1.0    3629\n",
       " 0.0      19\n",
       "dtype: int64"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "excess2.stack().value_counts()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "## Conclusion"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This notebook presents the method to label data according to excess return over median. This method can return either numerical or categorical labels for observations. Zhu et al. utilize these labels to predict monthly stock returns using linear regression and decision trees based on composite features as independent variables. In this process:\n",
    " - Forward rates of return for assets are calculated for the entire selection of stocks indexed by time bars.\n",
    " - At each time index, the median rate of return for all stocks is calculated. The median is subtracted from each stock's return to find the excess return over median.\n",
    " - If categorical labels are desired, the excess returns are converted to their signs.\n",
    "\n",
    "This method is useful for labelling data used for training regression models and decision trees. Zhu et al. found that decision trees were slightly better at predicting outperformers than linear regression."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## References"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "1. Zhu, M., Philpotts, D. and Stevenson, M., 2012. The benefits of tree-based models for stock selection. Journal of Asset Management, [online] 13(6), pp.437-448. Available at: <https://link.springer.com/article/10.1057/jam.2012.17>."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
